Abstract



We consider a multi-cell MIMO downlink (network MIMO) where B base-stations (BS) with M 
antennas connected to a central station (CS) serve K single-antenna user terminals (UT). Although 
many works have shown the potential benefits of network MIMO, the conclusion critically depends 
on the underlying assumptions such as channel state information at transmitters (CSIT) and backhaul 
links. In this paper, by focusing on the impact of partial CSIT, we propose an outage-efficient strategy. 
Namely, with side information of all UT's messages and local CSIT, each BS applies zero-forcing (ZF) 
beamforming in a distributed manner. For a small number of UTs ($K \le M$), the ZF beamforming
creates K parallel MISO channels. Based on the statistical knowledge of these parallel channels, the CS
performs a robust power allocation that simultaneously minimizes the outage probability of all UTs and
achieves a diversity gain of $B(M - K + 1)$ per UT. With a large number of UTs ($K > M$), we propose
a so-called distributed diversity scheduling (DDS) scheme to select a subset of $\tilde{K}$ UTs with limited
backhaul communication. It is proved that DDS achieves a diversity gain of $B\frac{K}{\tilde{K}}(M - \tilde{K} + 1)$, which
scales optimally with the number of cooperative BSs B as well as UTs. Numerical results confirm that
even under realistic assumptions such as partial CSIT and limited backhaul communications, network 
MIMO can offer high data rates with a sufficient reliability to individual UTs. 
Fig. 1. A multicell downlink with B = 3 BSs and K = 3 UTs.



I. Introduction 

Recently, network MIMO schemes, where neighboring Base-Stations (BSs) are connected to form an 
antenna array, have been proposed as a means to drastically increase the downlink capacity and solve the 
interference management problem of cellular systems [1]. Inspired by this result, we consider the multi-cell
MIMO downlink where B BSs with M antennas each, connected to a Central Station (CS), communicate
simultaneously with K User Terminals (UTs) with a single antenna each. Fig. 1 illustrates an example
of a multi-cell downlink system for B = K = 3 and M = 3. The channel at hand is modeled by the
Multi-Input Single-Output (MISO) interference channel, defined by

$$y_k[t] = \sum_{i=1}^{B} \mathbf{h}_{ik}^{T}\,\mathbf{x}_i[t] + n_k[t] \qquad (1)$$

for $t = 1, \dots, T$, where $y_k[t]$ is the channel output at UT k, $\mathbf{h}_{ik}$ denotes the channel vector from BS i
to UT k, $n_k[t] \sim \mathcal{N}_{\mathbb{C}}(0, 1)$ is the Additive White Gaussian Noise (AWGN), and $\mathbf{x}_i[t] \in \mathbb{C}^{M \times 1}$ denotes the
input vector transmitted by BS i subject to the average power constraint $P_i$.

If the CS or equivalently all the BSs have perfect Channel State Information at the Transmitter (CSIT) 
and share the messages of all UTs, the channel at hand reduces to a classical BM x K MIMO
broadcast channel with per-BS power constraints. In this case, the optimal strategy to maximize the 
multi-cell throughput is joint dirty -paper coding 12, O, (3]. In order to capture the essential features 
of the multi-cell systems while enabling the analysis tractable, the Wyner model 01 has been widely 
considered in the literature. In ||5l, the authors provide a survey on the information theoretic results on 
the multi-cell systems under the Wyner model for the Gaussian and fading channels with a single-antenna
BS (M = 1). These include the per-cell downlink capacity based on the circular Wyner model 0, 0, 
|P71 and the corresponding analysis in the different asymptotic regime such as high SNR, a large number 
of BSs, UTs Q, O. If each BS is equipped with multiple antennas (M > 1), the multi-cell downlink 
capacity can be naturally enhanced by exploiting the spatial degrees of freedom (see for example [8], [9],
[10]). For a small B, the MIMO multi-cell downlink channel is also referred to as the MIMO interference
channel or MIMO-X channel under various message sharing assumptions (see [1], [11] and references
therein). In these contributions, the sum degrees of freedom have been extensively studied.

Unfortunately, the global joint processing at the CS is difficult (if not impossible) in practice. This 
calls for practical network MIMO schemes which build on distributed processing at each BS by explicitly 
taking into account realistic aspects. A large number of recent contributions have been focused on 
practical designs by considering the following main limitations in network MIMO (see e.g. [12], [13],
[14], [15], [16], [17], [18] and references therein). First, a substantial amount of resources needs to be
dedicated for the CS to obtain accurate CSI. In particular, this overhead increases significantly with the
number of cooperative BSs, which in turn leaves few resources for the data transmission within a fixed
coherence block. In [12], this tradeoff between the benefits of network MIMO and the overhead of
channel estimation has been studied for the case of the uplink with M = 1 and B = K. Second, the
backbone links between the BSs and the CS are typically imperfect. They might be capacity-limited
[13], [14], [15], erroneous, or delayed [16]. These backhaul imperfections prevent the BSs from fully
sharing the side information on the messages or CSI. Similar effects may occur when only adjacent BSs
are connected to each other and exchange side information [17], [18].

Our contribution is no exception. We aim to design a practical scheme which ensures high data rates 
with sufficient reliability to individual UTs under partial CSIT. This objective is relevant to most
of the current and next-generation wireless standards [19], [20]. To this end, we assume that each BS i has local CSIT
while the CS has only statistical CSIT. Notice that the last assumption is reasonable because the CS
needs to track the downlink channels at a rate much slower than the coherence time. We consider that
the CS generates the messages destined to all K UTs and sends them to the B BSs over the backbone
links so that each BS locally encodes the messages and transmits. In order to concentrate on the impact
of partial CSIT, we do not take into account other practical limitations. In particular, the underlying
backbone links are assumed to be perfect such that the B BSs can fully share the messages.^1 First, we
address the case where the number of UTs does not exceed the number of transmit antennas, i.e., $K \le M$.
In this case, each BS applies ZF beamforming in a distributed manner, which creates K parallel MISO
channels. Based on the statistical knowledge of these parallel channels, the CS performs a robust power
allocation that simultaneously minimizes the outage probability of the K UTs and achieves a diversity gain
of $B(M - K + 1)$ per UT. Next, we consider the more relevant case of a large number of UTs ($K > M$).
In this case, we propose a simple scheduling scheme called distributed diversity scheduling (DDS) which
efficiently chooses a set of $\tilde{K} \le M$ UTs while limiting the amount of backhaul communication. More
precisely, each BS i with local CSIT chooses its best set of UTs over a predefined partition
and reports the corresponding index and value to the CS. The CS then decides on the selected set and
communicates it to all BSs. Finally, the selected UTs are served exactly in the same manner as in the previous case of
$K \le M$. It is proved that DDS offers a diversity gain of $B\frac{K}{\tilde{K}}(M - \tilde{K} + 1)$ to each UT and moreover this
gain scales optimally in B, K, and M, respectively. We remark that a similar CSIT assumption (local
CSIT at BSs and statistical CSIT at the CS) has also been considered in [21]. However, the objective of
that contribution is maximizing the multi-cell throughput rather than minimizing the outage probability
as we address here.

Numerical results validate the analysis in terms of diversity gain and show that our proposed distributed 
ZF beamforming significantly outperforms the non-cooperative scheme. Namely, the outage performance
of our proposed scheme improves with the number of cooperative BSs, transmit antennas, and UTs. It is
also shown in a simple one-dimensional topology that the performance gain becomes more pronounced as the
path-loss between neighboring cells decreases. The main finding of this paper is that even in a realistic
scenario with partial CSIT, network MIMO can be beneficial by providing high data rates with sufficient
reliability to individual UTs. This merit of network MIMO has been somewhat overlooked in most
existing works, which assume perfect CSIT.

The rest of the paper is organized as follows. The following section describes the system model. In 
Section III we present the power allocation policies that minimize the outage probability both under
perfect and statistical CSI at the CS. We consider the relevant case of K > M in Section IV, where we
propose a simple user scheduling algorithm and provide its diversity analysis. Section V shows some
numerical results and Section VI concludes the paper.

^1 If some backbone links fail, the corresponding BSs cannot encode the messages to the K users. This reduces the
effective number of BSs.
Notations: Throughout the paper, we use boldface lower case letters to denote vectors and boldface
capital letters to denote matrices. $(\cdot)^*$, $(\cdot)^T$, and $(\cdot)^H$ respectively denote the complex conjugation, matrix
transposition, and Hermitian transposition operations. $\mathbf{I}_n$ and $\mathbf{0}_{n \times m}$ represent the $n \times n$ identity matrix
and $n \times m$ zero matrix. The determinant, rank, trace, and Frobenius norm of a matrix $\mathbf{A}$ are denoted by
$|\mathbf{A}|$, rank($\mathbf{A}$), tr($\mathbf{A}$), and $\|\mathbf{A}\|_F$, respectively. The dot-equality stands for the near-zero equality, that is,
$f(\epsilon) \doteq \epsilon^n$ means $\lim_{\epsilon \to 0} \frac{\log f(\epsilon)}{\log \epsilon} = n$. We let $\chi^2_m$ denote the chi-square distribution with m degrees of
freedom.

II. Distributed Zero-Forcing Beamforming 

In order to split the processing at the CS and the BSs, it is reasonable to assume that the CS generates the 
messages and performs the resource allocation whereas each BS encodes and then transmits the symbols 
in a distributed fashion. More precisely, we assume that the CS broadcasts K messages intended to K 
UTs which enable the BSs to cooperate in transmission. Then, each BS i encodes these messages into
KT symbols $\{s_{ik}[t]\}$ for $k = 1, \dots, K$ and $t = 1, \dots, T$ by some capacity-achieving space-time coding
(with a sufficiently large number T of channel uses). Under this setting, this section focuses on a distributed precoder
design at each BS which requires only local CSIT $\{\mathbf{h}_{ik}\}_{k=1}^{K}$ for any i.

In the multi-cell downlink channel (1), we model the vector $\mathbf{h}_{ik}$ of channel coefficients from BS i
to UT k as Gaussian distributed, $\mathbf{h}_{ik} \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \sigma_{ik}\mathbf{I}_M)$, where the variance $\sigma_{ik}$ captures the path-loss of the
corresponding link assuming that the UTs are arbitrarily distributed. Furthermore, $\{\mathbf{h}_{ik}\}$ are assumed to
be independent over any pair i, k. We start with the definition of zero-forcing beamforming vectors as well as
a useful lemma on the resulting channel statistics.

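Definition 1 (Zero-forcing beamforming vectors): For a channel matrix $\mathbf{H} \in \mathbb{C}^{K \times M}$ with $K \le M$
linearly independent row vectors $\mathbf{h}_1^T, \dots, \mathbf{h}_K^T$, there exists a zero-forcing beamforming matrix
$\mathbf{G} = [\mathbf{g}_1, \dots, \mathbf{g}_K] \in \mathbb{C}^{M \times K}$, composed of K column vectors $\mathbf{g}_1, \dots, \mathbf{g}_K$, which is defined as

$$\mathbf{G} = \mathbf{H}^{+}\,\mathrm{diag}\{a_k\} \qquad (2)$$

where $\mathbf{H}^{+} = \mathbf{H}^H(\mathbf{H}\mathbf{H}^H)^{-1}$ is the pseudo-inverse of $\mathbf{H}$ and $\mathrm{diag}\{a_k\}$ is a diagonal matrix that normalizes the columns
of $\mathbf{H}^{+}$ such that $\|\mathbf{g}_k\| = 1$ for any k.

Lemma 1: If $\mathbf{H} \in \mathbb{C}^{K \times M}$ has independent entries where each row $\mathbf{h}_k^T \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \sigma_k\mathbf{I})$, then $|a_k|^2$ as defined
in (2) is $\chi^2_{2(M-K+1)}$ distributed with mean $E[|a_k|^2] = \sigma_k$, $\forall k$.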
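As a concrete illustration of Definition 1, the following minimal sketch builds the two unit-norm ZF beamformers for the special case K = 2. It assumes the transpose convention $a_k = \mathbf{h}_k^T\mathbf{g}_k$ used in this section; the helper names `bilinear` and `zf_pair`, the choice M = 3, and the random seed are illustrative only. For K = 2 each beamformer is the normalized projection of $\mathbf{h}_k^*$ onto the orthogonal complement of the other user's conjugated channel, which has the same direction as the corresponding column of $\mathbf{H}^{+}$.

```python
import random

def bilinear(u, v):
    """Return u^T v (plain transpose, no conjugation)."""
    return sum(a * b for a, b in zip(u, v))

def zf_pair(h1, h2):
    """Unit-norm beamformers (g1, g2) with h2^T g1 = 0 and h1^T g2 = 0."""
    def beam(h_own, h_other):
        own_c = [x.conjugate() for x in h_own]
        other_c = [x.conjugate() for x in h_other]
        # Project conj(h_own) onto the orthogonal complement of conj(h_other).
        coef = bilinear(h_other, own_c) / bilinear(h_other, other_c)
        v = [a - coef * b for a, b in zip(own_c, other_c)]
        norm = sum(abs(x) ** 2 for x in v) ** 0.5
        return [x / norm for x in v]
    return beam(h1, h2), beam(h2, h1)

# Hypothetical example: M = 3 antennas, two users, unit-variance channels.
rng = random.Random(0)
M = 3
h1 = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(M)]
h2 = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(M)]
g1, g2 = zf_pair(h1, h2)
a1 = bilinear(h1, g1)    # effective scalar channel of user 1
leak = bilinear(h2, g1)  # interference leaked to user 2 (numerically zero)
```

With this construction the effective channel `a1` is real and positive, and the leakage term vanishes up to rounding error.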

We consider a simple ZF beamforming which enables each UT to achieve a multiplexing gain of one.
At each channel use t, BS i applies ZF beamforming to transmit K symbols in a distributed manner.
Due to the symmetry over all channel uses, we ignore the index t hereafter. We form the transmit vector
of BS i at any channel use as

$$\mathbf{x}_i = \mathbf{G}_i\mathbf{u}_i = \sum_{k=1}^{K}\sqrt{p_{ik}}\,\mathbf{g}_{ik}s_{ik} \qquad (3)$$

where we let $\mathbf{G}_i = [\mathbf{g}_{i1}, \dots, \mathbf{g}_{iK}]$ be the ZF beamforming matrix corresponding to the channel matrix
from BS i formed by row vectors $\mathbf{h}_{i1}^T, \dots, \mathbf{h}_{iK}^T$, and $\mathbf{u}_i = [\sqrt{p_{i1}}s_{i1}, \dots, \sqrt{p_{iK}}s_{iK}]^T$ is the vector of symbols,
where $s_{ik} \sim \mathcal{N}_{\mathbb{C}}(0, 1)$ denotes the symbol transmitted by BS i to UT k with power $p_{ik}$. With this
beamforming, the received signal at UT k is given by

$$y_k = \sum_{i=1}^{B}\sqrt{p_{ik}}\,a_{ik}s_{ik} + n_k \qquad (4)$$

where $a_{ik} = \mathbf{h}_{ik}^T\mathbf{g}_{ik}$ denotes the overall channel from BS i to UT k and coincides with the definition in
(2) for each i. From Lemma 1 and the independence of $\{a_{ik}\}$ over i, we remark that the original $B \times K$
MISO interference channel (1) is decoupled into K parallel $B \times 1$ MISO channels.

Lemma 2: Let $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{z}$ denote a MIMO slow fading channel with standard complex Gaussian
noise $\mathbf{z}$. If we have

$$\Pr\{\|\mathbf{H}\|_F^2 < \epsilon\} \doteq \epsilon^{d} \qquad (5)$$

then the maximum diversity order of the channel is d.

Theorem 1: With distributed ZF beamforming, the diversity order of each UT is $B(M - K + 1)$.
Proof: Appendix A. ■
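The exponent in Theorem 1 can be checked numerically with a small sketch. It assumes, as a simplification, unit powers and that each $|a_{ik}|^2$ is Gamma distributed with integer shape $M - K + 1$ and unit scale, so that the sum over the B independent BSs is Gamma distributed with shape $n = B(M - K + 1)$. The function names and the two probe values of $\epsilon$ are hypothetical choices.

```python
import math

def gamma_cdf_small(eps, n, terms=30):
    """Pr{X < eps} for X ~ Gamma(shape n, scale 1), integer n, computed with
    the tail series exp(-eps) * sum_{j >= n} eps**j / j! (stable for small eps)."""
    total = 0.0
    for j in range(n, n + terms):
        total += eps ** j / math.factorial(j)
    return math.exp(-eps) * total

def diversity_slope(B, M, K, eps1=1e-3, eps2=1e-4):
    """Empirical exponent d in Pr{sum_i |a_ik|^2 < eps} ~ eps**d near zero."""
    n = B * (M - K + 1)
    f1 = gamma_cdf_small(eps1, n)
    f2 = gamma_cdf_small(eps2, n)
    return (math.log(f1) - math.log(f2)) / (math.log(eps1) - math.log(eps2))

slope = diversity_slope(B=2, M=3, K=2)  # close to B * (M - K + 1) = 4
```

The log-log slope of the near-zero probability approaches $B(M - K + 1)$ as $\epsilon$ shrinks, which is the sense of the dot-equality in (5).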

Assuming that each UT k perfectly knows the channel state $\mathbf{a}_k = (a_{1k}, \dots, a_{Bk})$, it decodes the
space-time code and achieves the following rate

$$R_k = \log\Big(1 + \sum_{i=1}^{B}|a_{ik}|^2p_{ik}\Big). \qquad (6)$$

The capacity region of the K parallel MISO channels (4) for a fixed set of powers $\mathbf{p}_k = (p_{1k}, \dots, p_{Bk})$
and channel states $\mathbf{a}_k = (a_{1k}, \dots, a_{Bk})$ for all k is given by

$$\mathcal{R}(\mathbf{a}; \mathbf{p}) = \Big\{\mathbf{R} \in \mathbb{R}_+^K : R_k \le \log\Big(1 + \sum_{i=1}^{B}|a_{ik}|^2p_{ik}\Big),\ \forall k\Big\} \qquad (7)$$

where we let $\mathbf{a} = \{\mathbf{a}_k\}$, $\mathbf{p} = \{\mathbf{p}_k\}$ for notational simplicity. The above region is clearly convex
(rectangular for K = 2). The capacity region of the K parallel MISO channels (4) under the individual
BS power constraints $\mathbf{P} = (P_1, \dots, P_B)$ for a fixed channel state $\mathbf{a}$ is given by

$$\mathcal{C}(\mathbf{a}; \mathbf{P}) = \bigcup_{\mathcal{P} \in \mathcal{F}}\mathcal{R}(\mathbf{a}; \mathcal{P}(\mathbf{a})) \qquad (8)$$

where $\mathcal{P}$ denotes a power allocation policy $\mathbf{a} \mapsto \mathbf{p}$ that maps the channel state $\mathbf{a}$ into the power vector
$\mathbf{p}$ with component $\mathcal{P}_{ik}(\mathbf{a}) = p_{ik}$, and $\mathcal{F}$ denotes the feasibility set satisfying $\sum_{k=1}^{K}\mathcal{P}_{ik}(\mathbf{a}) \le P_i$, $\forall i$, for
any channel realization $\mathbf{a}$. The capacity region $\mathcal{C}(\mathbf{a}; \mathbf{P})$ is convex and its boundary can be explicitly
characterized by solving the weighted sum rate maximization as specified in subsection III-A.

III. Power allocation minimizing outage probability 

In the previous section, we considered the distributed processing at each BS assuming that the power 
allocation is already done at the CS and each BS is informed about the resulting power partition. Here, we 
address the power allocation problem solved at the CS. In particular, we are interested in a robust power 
allocation strategy which requires only statistical CSIT. To this end, we consider a relevant scenario where
the system imposes a target rate on each UT depending on its application. This is typical of
current standards such as WiMax [20] and LTE [19]. One of the important goals in such a situation is to
minimize the outage probability simultaneously for all UTs. In order to formalize the problem, we let $\gamma_k$
denote the target rate of UT k and form the target rate tuple $\boldsymbol{\gamma} = (\gamma_1, \dots, \gamma_K)$. For a given $\boldsymbol{\gamma}$, we define
the outage probability as the probability that $\boldsymbol{\gamma}$ cannot be supported for all K UTs simultaneously,
namely

$$P_{\mathrm{out}}(\boldsymbol{\gamma}) = 1 - E_{\mathbf{a}}\big[\mathbb{1}\{\boldsymbol{\gamma} \in \mathcal{C}(\mathbf{a}, \mathbf{P})\}\big] = 1 - \Pr\big(\boldsymbol{\gamma} \in \mathcal{C}(\mathbf{a}, \mathbf{P})\big). \qquad (9)$$

This section provides the power allocation policies minimizing the outage probability defined above under
perfect and statistical CSIT.

A. Perfect CSI at CS 

We start with a special case where the CS has perfect CSIT. In this case, we are particularly interested
in the power allocation policy that provides a rate tuple proportional to the target rate tuple (rate-balancing).
As seen shortly, this policy equalizes the individual outage probability of all UTs and thus
provides strict fairness among UTs. The latter is one of the most desired properties. Our objective is to
find the set of $\{p_{ik}\}$ satisfying

$$\frac{R_k(\mathbf{p})}{R_1(\mathbf{p})} = \frac{\gamma_k}{\gamma_1} = \alpha_k, \quad k = 2, \dots, K \qquad (10)$$

where we defined $\alpha_k = \gamma_k/\gamma_1$ and $\alpha_1 = 1$. More precisely, the power allocation is a solution of [22]

$$\min_{\boldsymbol{\theta}:\ \sum_k\theta_k = 1}\ \max_{\mathbf{R} \in \mathcal{C}(\mathbf{a}, \mathbf{P})}\ \sum_{k=1}^{K}\theta_k\frac{R_k}{\alpha_k}. \qquad (11)$$
Notice that the inner problem is the weighted sum rate maximization for a fixed weight set, while the outer
problem with respect to $\theta_2, \dots, \theta_K$ is a convex problem for which a $(K - 1)$-dimensional subgradient
method can be suitably applied [23]. First, we describe a numerical method to solve the inner problem,
given by

$$\max\ \sum_{k=1}^{K}w_k\log\Big(1 + \sum_{i=1}^{B}|a_{ik}|^2p_{ik}\Big) \qquad (12)$$

where $w_k = \theta_k/\alpha_k$ denotes a non-negative weight for all k that we assume given for the time being.
Since the above problem is convex (the objective function is concave and the constraints are linear), it
is necessary and sufficient to solve the KKT conditions [24], given by



$$\frac{w_k|a_{ik}|^2}{1 + \sum_{j=1}^{B}|a_{jk}|^2p_{jk}} \le \mu_i, \quad k = 1, \dots, K,\ i = 1, \dots, B \qquad (13)$$

with equality whenever $p_{ik} > 0$, where $\mu_i$ denotes the Lagrangian variable to be determined such that $\sum_k p_{ik} \le P_i$. Unfortunately, solving
(13) directly seems intractable. Nevertheless, the following waterfilling approach inspired by the iterative
multiuser waterfilling for the MIMO multiple access channel [25] solves the KKT conditions iteratively.
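Algorithm A1: Iterative waterfilling algorithm for B-BS weighted sum rate maximization

1) Initialize $\mathbf{p}_i^{(0)} = \mathbf{0}$ for $i = 1, \dots, B$ and $c_{ik}^{(0)} = 0$.
2) At each iteration n
   For $i = 1, \dots, B$
   • Compute $c_{ik}^{(n)} = \sum_{j \ne i}|a_{jk}|^2p_{jk}$ for all k, using the current powers of the other BSs.
   • Waterfilling step: let $\mathbf{p}_i^{(n)}$ be

     $$p_{ik}^{(n)} = \Big[\frac{w_k}{\mu_i} - \frac{1 + c_{ik}^{(n)}}{|a_{ik}|^2}\Big]_+, \quad \forall k \qquad (14)$$

     where $[x]_+ = \max(x, 0)$ and $\mu_i$ is determined such that $\sum_{k=1}^{K}p_{ik}^{(n)} = P_i$.
   End
3) Continue until convergence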
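The steps of Algorithm A1 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes unit noise variance, rates measured with `log2`, strictly positive gains $|a_{ik}|^2$, and a bisection search for the water level $\mu_i$. The function names and the numerical example (two BSs, two UTs) are hypothetical.

```python
import math

def waterfill(w, g, noise, budget, iters=200):
    """Weighted waterfilling: p_k = max(0, w_k/mu - noise_k/g_k), sum_k p_k = budget."""
    def powers(mu):
        return [max(0.0, wk / mu - nk / gk) for wk, gk, nk in zip(w, g, noise)]
    lo = 1e-12
    # At mu = hi every term is clipped to zero, so the total power is zero there.
    hi = max(wk * gk / nk for wk, gk, nk in zip(w, g, noise))
    for _ in range(iters):  # bisection on mu (total power decreases with mu)
        mu = 0.5 * (lo + hi)
        if sum(powers(mu)) > budget:
            lo = mu
        else:
            hi = mu
    return powers(0.5 * (lo + hi))

def iterative_waterfilling(w, gain, budgets, sweeps=100):
    """gain[i][k] = |a_ik|^2, budgets[i] = P_i; returns the powers p[i][k]."""
    B, K = len(gain), len(w)
    p = [[0.0] * K for _ in range(B)]
    for _ in range(sweeps):
        for i in range(B):
            # 1 + c_ik: unit noise plus the signal contributed by the other BSs.
            noise = [1.0 + sum(gain[j][k] * p[j][k] for j in range(B) if j != i)
                     for k in range(K)]
            p[i] = waterfill(w, gain[i], noise, budgets[i])
    return p

def weighted_sum_rate(w, gain, p):
    B = len(gain)
    return sum(wk * math.log2(1.0 + sum(gain[i][k] * p[i][k] for i in range(B)))
               for k, wk in enumerate(w))

# Hypothetical example with B = 2 BSs and K = 2 UTs.
w = [1.0, 1.0]
gain = [[1.0, 0.5], [0.3, 2.0]]
budgets = [10.0, 10.0]
p_opt = iterative_waterfilling(w, gain, budgets)
p_equal = [[5.0, 5.0], [5.0, 5.0]]
```

In this example each BS ends up concentrating its power on the UT toward which it has the stronger effective channel, and the resulting weighted sum rate exceeds that of the equal power split.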

We have the following remarks on the proposed algorithm. 

Remark 3.1: Algorithm A1 is a generalization of the classical waterfilling algorithm for K
parallel channels under a total power constraint to the case of B transmitters with individual power
constraints. Indeed, for the single-BS case (B = 1), the objective function reduces to the weighted sum
rate of the K parallel interference-free channels given by

$$\max\ \sum_{k=1}^{K}w_k\log\big(1 + |a_k|^2q_k\big). \qquad (15)$$
Remark 3.2: The set of powers in (14) corresponds to the solution of the new objective function

$$\mathbf{p}_i^{(n)} = \arg\max_{\mathbf{q}:\ \sum_k q_k \le P_i}\ \sum_{k=1}^{K}w_k\log\Big(1 + \frac{|a_{ik}|^2q_k}{1 + c_{ik}}\Big). \qquad (16)$$

It can be easily seen that this new problem and the original problem (12) yield the same KKT conditions
(13) for $c_{ik} = \sum_{j \ne i}|a_{jk}|^2p_{jk}$. In other words, the solution (14) of BS i corresponds to treating $\{c_{ik}\}_{k=1}^{K}$,
a function of the powers $\{\mathbf{p}_j\}_{j \ne i}$ of the other BSs, as additional (constant) noise, i.e., as if they did not
depend on $\mathbf{p}_i$. Under the individual power constraints for each BS, the sequential iteration over different
BSs provably converges (see the convergence proof below).

By a rather straightforward extension of [25, Theorem 1, Theorem 2], we have the following convergence
result.

Theorem 2: Algorithm A1 converges to the optimal solution of the weighted sum rate maximization
(12).

Proof: The proof follows in the footsteps of the proofs of [25, Theorem 1, Theorem 2]. At each
iteration, the proposed iterative algorithm finds the solution to the weighted sum rate maximization of the
single-BS parallel channels (15) for each BS while treating the powers of all the other BSs as additional
noise. Comparing the objective of the single-BS parallel channels (15) and that of the multi-BS parallel
channels (16), they differ only by the constant $c_{ik}(\{p_{jk}\}_{j \ne i})$. Hence, the objective is nondecreasing at
each waterfilling step and further converges to a limit. The powers $\mathbf{p}_1, \dots, \mathbf{p}_B$ of the B BSs also converge
to the limit $\mathbf{p}_1^*, \dots, \mathbf{p}_B^*$. This is because the solution to the single-BS parallel channels is unique. ■

The outer problem consists of minimizing the solution of the inner problem with respect to $\theta_2, \dots, \theta_K$.
This can be done by the subgradient method. We summarize the overall algorithm as follows.
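Rate balancing algorithm

1) Initialize $\theta_k^{(0)} \in [0, 1]$ for $k > 1$.

2) At iteration n, compute

$$\mathbf{R}^{(n)} = \arg\max_{\mathbf{R} \in \mathcal{C}(\mathbf{a}, \mathbf{P})}\ \sum_{k}\theta_k^{(n)}\frac{R_k}{\alpha_k} \qquad (17)$$

via the waterfilling approach, and compute a subgradient for each user $k > 1$:

$$\Delta_k^{(n)} = \frac{R_k^{(n)}}{\alpha_k} - R_1^{(n)}.$$

3) Update the weights by the subgradient:

$$\theta_k^{(n+1)} = \theta_k^{(n)} - s_n\Delta_k^{(n)}$$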
where $s_n$ denotes the step size at iteration n that can be chosen according to a decreasing rule.
The overall algorithm, with the inner problem solved by Algorithm A1 and the outer problem solved by
the subgradient method, implements rate-balancing by allocating rates proportional to the target
rate tuple. Hereafter, we let $R_k^*(\mathbf{a})$ denote the rate of user k allocated by the rate-balancing algorithm in
channel state $\mathbf{a}$. Notice that the achievable rate $\mathbf{R}^*(\mathbf{a})$ is a deterministic function of the power allocation $\mathcal{P}^*(\mathbf{a})$. Then, we have
the following result.

Theorem 3: The rate-balancing power allocation minimizes the outage probability. Moreover, it equalizes
the individual outage probability of all K UTs by letting $\Pr(\gamma_1 < R_1^*(\mathbf{a})) = \dots = \Pr(\gamma_K < R_K^*(\mathbf{a}))$.
Proof: Appendix B. ■

B. Statistical CSI at CS 

We consider a more realistic case where the CS has only statistical knowledge of the equivalent channels
$\mathbf{a}$. Formally, we define the power allocation policy $\mathcal{P}^{\mathrm{stat}} : \boldsymbol{\sigma} \mapsto \mathbf{p}$ as a function mapping the variances of the
channel coefficients $\boldsymbol{\sigma} = (\sigma_{11}, \dots, \sigma_{BK})$ into the power vector $\mathbf{p}$ with component $\mathcal{P}^{\mathrm{stat}}_{ik}(\boldsymbol{\sigma}) = p_{ik}$. Since
the equivalent channel coefficients $\{a_{ik}\}$ after ZF beamforming are correlated over different k for each
BS i, the individual outage events are dependent. Nevertheless, this dependency can be made arbitrarily
small, for example by a simple interleaving over frequency bands. Under this assumption, we approximate
the outage probability for a fixed power allocation $\mathbf{p}$ as



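$$P_{\mathrm{out}}^{\mathrm{app}}(\boldsymbol{\gamma}, \mathbf{p}) = 1 - \prod_{k=1}^{K}\Pr\big(\Delta_k(\mathbf{p}_k) > 2^{\gamma_k} - 1\big) \qquad (18)$$

where we let $\Delta_k(\mathbf{p}_k) = \sum_{i=1}^{B}|a_{ik}|^2p_{ik}$. First, we remark that for a fixed power $\mathbf{p}_k = (p_{1k}, \dots, p_{Bk})$ of
UT k, $\Delta_k(\mathbf{p}_k)$ is a Hermitian quadratic form of complex Gaussian random variables given by

$$\Delta_k(\mathbf{p}_k) = (a_{1k}^*, \dots, a_{Bk}^*)\,\mathrm{diag}(p_{1k}, \dots, p_{Bk})\,(a_{1k}, \dots, a_{Bk})^T$$
$$= (\mathbf{w}_{1k}^H, \dots, \mathbf{w}_{Bk}^H)\,\mathrm{diag}(p_{1k}\mathbf{I}_{M-K+1}, \dots, p_{Bk}\mathbf{I}_{M-K+1})\,(\mathbf{w}_{1k}^T, \dots, \mathbf{w}_{Bk}^T)^T$$

where the second equality follows by replacing a chi-square random variable $|a_{ik}|^2 \sim \chi^2_{2(M-K+1)}$
with $\|\mathbf{w}_{ik}\|^2$ where $\mathbf{w}_{ik} \sim \mathcal{N}_{\mathbb{C}}(\mathbf{0}, \frac{\sigma_{ik}}{M-K+1}\mathbf{I}_{M-K+1})$. The individual outage probability that UT k cannot
support $\gamma_k$ for a fixed $\mathbf{p}_k$ is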



$$\Pr(\Delta_k(\mathbf{p}_k) < c_k) = \frac{1}{2\pi j}\int_{c - j\infty}^{c + j\infty}\frac{e^{sc_k}\,\Phi_{\Delta_k(\mathbf{p}_k)}(s)}{s}\,ds \qquad (19)$$

where we let $c_k = 2^{\gamma_k} - 1$ denote the target SNR of UT k, and the Laplace transform of $\Delta_k(\mathbf{p}_k)$ is given
by

$$\Phi_{\Delta_k(\mathbf{p}_k)}(s) = \prod_{i=1}^{B}\frac{1}{\big(1 + s\,p_{ik}\frac{\sigma_{ik}}{M-K+1}\big)^{M-K+1}}. \qquad (20)$$

From (20), we see immediately that each UT achieves a diversity gain of $B(M - K + 1)$, which agrees
with Theorem 1. A widely used upper bound is the Chernoff bound, which for fixed powers $\mathbf{p}_k$ is given by

$$\Pr(\Delta_k(\mathbf{p}_k) < c_k) \le \min_{\lambda \ge 0}\ e^{\lambda c_k}\,\Phi_{\Delta_k(\mathbf{p}_k)}(\lambda) \triangleq F(c_k, \mathbf{p}_k). \qquad (21)$$
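To give a feel for how loose or tight the Chernoff bound (21) is, the sketch below evaluates it numerically and compares it with a Monte Carlo estimate of the individual outage probability. It assumes the statistical model stated above, i.e. $|a_{ik}|^2$ is Gamma distributed with shape $m = M - K + 1$ and scale $\beta_{ik} = \sigma_{ik}/(M - K + 1)$. The function names, the bisection search over $\lambda$, and the example values (B = 2, M = 4, K = 2, unit variances, unit powers, target SNR $c_k = 1$) are illustrative assumptions.

```python
import math
import random

def chernoff_outage_bound(c, powers, betas, m):
    """min over lam >= 0 of exp(lam*c) / prod_i (1 + betas[i]*lam*powers[i])**m."""
    x = [b * p for b, p in zip(betas, powers)]
    def dlog(lam):  # derivative of the log of the bound; increasing in lam
        return c - sum(m * xi / (1.0 + lam * xi) for xi in x)
    if dlog(0.0) >= 0.0:
        return 1.0  # trivial bound: the mean received SNR is below the target
    hi = 1.0
    while dlog(hi) < 0.0:
        hi *= 2.0
    lo = 0.0
    for _ in range(200):  # bisection on the stationarity condition
        mid = 0.5 * (lo + hi)
        if dlog(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return math.exp(lam * c - sum(m * math.log1p(lam * xi) for xi in x))

def simulated_outage(c, powers, betas, m, trials=20000, seed=1):
    """Monte Carlo estimate of Pr{sum_i p_i |a_i|^2 < c} under the Gamma model."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        delta = sum(p * rng.gammavariate(m, b) for p, b in zip(powers, betas))
        hits += delta < c
    return hits / trials

# Hypothetical example: B = 2, M = 4, K = 2 (m = 3), sigma_ik = 1, p_ik = 1.
m = 3
betas = [1.0 / m, 1.0 / m]
powers = [1.0, 1.0]
bound = chernoff_outage_bound(1.0, powers, betas, m)
empirical = simulated_outage(1.0, powers, betas, m)
```

For these values the minimizing $\lambda$ is 3 and the bound equals $e^{3}/64 \approx 0.31$, noticeably above the simulated outage probability, which is why the bound is used only as a tractable surrogate for the power optimization.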

Using the last expression of the Chernoff upper bound for each k, the approximated outage probability
for a fixed $\mathbf{p}$ is upper-bounded by

$$P_{\mathrm{out}}^{\mathrm{app}}(\boldsymbol{\gamma}, \mathbf{p}) \le 1 - \prod_{k=1}^{K}\big(1 - F(c_k, \mathbf{p}_k)\big). \qquad (22)$$

Since the power optimization based on the exact outage probability is not tractable, we search for the power
allocation that minimizes the Chernoff upper bound, or equivalently solves

$$\begin{aligned}
\text{maximize}\quad & f(\{\lambda_k\}, \{p_{ik}\}) = \prod_{k=1}^{K}\big(1 - h_k(\lambda_k, \mathbf{p}_k)\big) \\
\text{subject to}\quad & \sum_{k=1}^{K}p_{ik} \le P_i, \quad i = 1, \dots, B \\
& \lambda_k \ge 0, \quad k = 1, \dots, K \\
& p_{ik} \ge 0, \quad i = 1, \dots, B,\ k = 1, \dots, K
\end{aligned} \qquad (23)$$

where we define

$$h_k(\lambda_k, \mathbf{p}_k) = \frac{e^{\lambda_kc_k}}{\prod_{i=1}^{B}(1 + \beta_{ik}\lambda_kp_{ik})^{M-K+1}} \qquad (24)$$

where $\beta_{ik} = \frac{\sigma_{ik}}{M-K+1}$. If $h_k \ge 1$ for some k, the objective becomes null regardless of the power allocation.
In this case a reasonable choice is to let $p_{ik} = 0$, $\forall i$, for such k and equally allocate the power to $\{p_{ik'}\}$
for $k' \ne k$. In the following, by focusing on the case $h_k < 1$ for all k, we provide an efficient numerical
method to solve the problem (23). We remark first that the maximization of f with respect to $\{\lambda_k\}$ can
be decoupled into the minimization of $h_k$ over $\lambda_k$ for each k, where $h_k$ is convex in $\lambda_k$. Moreover, since
f is concave in $\{p_{ik}\}$, the overall problem is convex.

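Minimization of $h_k$ over $\lambda_k$: Since $h_k$ is convex in $\lambda_k$, the
optimal $\lambda_k$ for a fixed set of powers is obtained by setting the derivative of $h_k$ to zero, i.e., it is the solution of

$$\frac{c_k}{M-K+1} = \sum_{i=1}^{B}\frac{\beta_{ik}p_{ik}}{1 + \lambda_k\beta_{ik}p_{ik}} \qquad (25)$$

which is a polynomial equation of degree B in $\lambda_k$. For B = 2, the solution is given in closed form: if $p_{1k}p_{2k} = 0$,

$$\lambda_k = \frac{M-K+1}{c_k} - \frac{1}{\beta_{1k}p_{1k} + \beta_{2k}p_{2k}}, \qquad (26)$$

and otherwise $\lambda_k$ is the non-negative root of the quadratic equation obtained from (25).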
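For B = 2 the stationarity condition (25) can be solved explicitly. The sketch below writes $x_i = \beta_{ik}p_{ik}$ and $r = c_k/(M - K + 1)$, clears denominators to obtain the quadratic $r x_1 x_2\lambda^2 + (r(x_1 + x_2) - 2x_1x_2)\lambda + r - (x_1 + x_2) = 0$, and returns its larger root clipped at zero. This derivation is a worked example based on (25), not a formula quoted from the text; the function names are hypothetical and at least one $x_i$ is assumed positive.

```python
import math

def chernoff_lambda_two_bs(c, m, x1, x2):
    """Solve c/m = x1/(1 + lam*x1) + x2/(1 + lam*x2) for lam >= 0 (B = 2)."""
    r = c / m
    if x1 == 0.0 or x2 == 0.0:
        x = x1 + x2  # the only active term, assumed positive
        return max(0.0, 1.0 / r - 1.0 / x)
    # Discriminant simplifies to r^2 (x1 - x2)^2 + 4 x1^2 x2^2, always positive.
    disc = math.sqrt((r * (x1 - x2)) ** 2 + (2.0 * x1 * x2) ** 2)
    lam = (2.0 * x1 * x2 - r * (x1 + x2) + disc) / (2.0 * r * x1 * x2)
    return max(0.0, lam)  # lam = 0 when the unconstrained root is negative

def residual(lam, c, m, x1, x2):
    """Left-hand side minus right-hand side of the stationarity condition."""
    return c / m - x1 / (1.0 + lam * x1) - x2 / (1.0 + lam * x2)

lam_both = chernoff_lambda_two_bs(3.0, 2, 4.0, 1.0)    # both BSs active
lam_single = chernoff_lambda_two_bs(3.0, 2, 4.0, 0.0)  # one BS silent
```

When one of the powers is zero the expression reduces to the single-term form in (26); when the target exceeds the mean received SNR ($r \ge x_1 + x_2$) both roots are non-positive and the minimizer is $\lambda = 0$.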



Maximization of f over $\{p_{ik}\}$: Since f is concave in $\{p_{ik}\}$, we form the Lagrangian function by
introducing B Lagrangian multipliers $\{\mu_i\}$, each of which is associated with the power constraint of BS i.
By arranging the terms common to all k, we obtain the KKT conditions for $k = 1, \dots, K$

$$\frac{h_k(\mathbf{p}_k)}{1 - h_k(\mathbf{p}_k)}\cdot\frac{\lambda_k\beta_{ik}}{1 + p_{ik}\beta_{ik}\lambda_k} = \mu_i. \qquad (27)$$

When treating $p_{1k}, \dots, p_{i-1,k}, p_{i+1,k}, \dots, p_{B,k}$ as fixed, the LHS of (27), denoted by $\phi_{ik}$, is a strictly positive
and monotonically decreasing function of $p_{ik}$ (since we exclude the case $\lambda_k = 0$). It remains to determine
$\mu_i$ such that the power constraint of BS i is satisfied, i.e., $p_{i1} + \dots + p_{iK} = P_i$. When treating the powers
of the other BSs $j \ne i$ as fixed, the powers $\mathbf{p}_i$ of BS i can be found by a simple line search over $\mu_i$.

The following summarizes our proposed iterative algorithm to minimize the Chernoff upper bound, or
equivalently to solve (23).

Algorithm A2: Iterative algorithm for the Chernoff upper bound minimization

1) Initialize $\mathbf{p}^{(0)}$
2) At iteration n
   For $i = 1, \dots, B$
   • Update $\lambda_k^{(n)}$ for all k by solving the polynomial equation (25)
   • Find the new power vector $\mathbf{p}_i^{(n)}$ of BS i by line search
   End
3) Continue until convergence of f

Although we are unable to provide a formal proof, we conjecture that Algorithm A2 converges to the
optimal solution. At each iteration, $\lambda_k^{(n)}$ is determined as the unique solution for all k and a fixed set of
powers. Regarding the power iteration, since the objective (27) is concave in $p_{ik}$ when fixing all other
powers, a sequential update of the powers $\mathbf{p}_1, \mathbf{p}_2, \dots, \mathbf{p}_B, \mathbf{p}_1, \dots$ should converge under individual BS power
constraints by an argument similar to the proof of Theorem 2.

IV. Effect of user scheduling 

In this section we address the relevant case where the number of UTs is larger than the number of
transmit antennas, $K > M$. In order for each BS to apply the ZF beamforming in a distributed fashion,
a set of $\tilde{K} \le M$ UTs shall be selected beforehand. We assume that the user selection (scheduling) is
handled by the CS together with the power allocation. In particular, we focus on a user selection method
which achieves a high diversity order while limiting the amount of side information necessary at the
CS and the BSs. In the following, we present our proposed user selection scheme as well as the analysis
of its achievable diversity gain.

A. Distributed Diversity Scheduling (DDS) 

Let $\mathcal{S}$ and $\mathcal{U}$ denote the set of all K users and the set of $\tilde{K}$ selected users, with $|\mathcal{S}| = K$ and $|\mathcal{U}| = \tilde{K}$, respectively. Let
us also define $\mathcal{Q}(\tilde{K})$ as the set of all possible user selections, i.e., $\mathcal{Q}(\tilde{K}) = \{\mathcal{U} \mid \mathcal{U} \subseteq \mathcal{S}, |\mathcal{U}| = \tilde{K}\}$ for
$\tilde{K} \le M$.^2 Then, the equivalent channel from the BSs to the selected users is

$$y_k = \mathbf{a}_k\mathbf{u}_k + z_k, \quad k \in \mathcal{U}, \qquad (28)$$

which is a MISO channel with $\mathbf{a}_k = [a_{1k} \cdots a_{Bk}]$ and $\mathbf{u}_k = [\sqrt{p_{1k}}s_{1k} \cdots \sqrt{p_{Bk}}s_{Bk}]^T$. For convenience,
we only consider the diversity order of the worst user and refer to it as the diversity of the system hereafter.
Since the diversity order of a given channel depends solely on the Euclidean norm of the channel matrix,
as shown in Lemma 2, the following user selection scheme maximizes the diversity of the system

$$\mathcal{U}^* = \arg\max_{\mathcal{U} \in \mathcal{Q}}\ \min_{k \in \mathcal{U}}\ \|\mathbf{a}_k\|^2. \qquad (29)$$

Unfortunately, this scheduling scheme has two major drawbacks: 1) perfect knowledge at the CS of
$\{a_{ik}\}$, crucial for the scheduling, is hardly implementable as mentioned above, and 2) the number
$|\mathcal{Q}(\tilde{K})| = \binom{K}{\tilde{K}}$ of candidate sets $\mathcal{U}$ over which to maximize grows polynomially with K.
To overcome the first drawback, we use the following selection scheme

$$\mathcal{U}_d = \arg_{\mathcal{U}}\ \max_{i = 1, \dots, B}\ \max_{\mathcal{U} \in \mathcal{Q}}\ \min_{k \in \mathcal{U}}\ |a_{ik}|^2. \qquad (30)$$

That is, BS i finds the set $\mathcal{U}$ that maximizes $\min_{k \in \mathcal{U}}|a_{ik}|^2$ and sends both the index of the set and
the corresponding maximum value to the CS. Upon reception of the B values and the corresponding sets
from the B BSs, the CS makes a decision by selecting the largest one. Therefore, only partial channel
state information is communicated over the BS-CS links. To address the second drawback, we narrow down
the choices of $\mathcal{U}$ to the following $\kappa = K/\tilde{K}$ possibilities^3

$$\mathcal{P}_{\mathcal{S}} = \{\mathcal{U}_1, \mathcal{U}_2, \dots, \mathcal{U}_\kappa\}, \quad \bigcup_i\mathcal{U}_i = \mathcal{S}, \quad \mathcal{U}_i \cap \mathcal{U}_j = \emptyset,\ \forall i \ne j, \quad |\mathcal{U}_i| = \tilde{K}.$$

^2 For convenience of notation, we will drop the argument $\tilde{K}$ whenever confusion is unlikely.

^3 Here, we assume that $K/\tilde{K}$ is an integer for simplicity of demonstration. However, it will be shown that the same conclusion holds
otherwise.
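The reduction in search size obtained by restricting the candidates to a fixed partition can be quantified with a short sketch. The function name and the example values (K = 12 UTs served in groups of 3) are hypothetical, and the number of UTs is assumed to be a multiple of the group size, as in footnote 3.

```python
import math

def candidate_counts(K, K_tilde):
    """Number of candidate sets for exhaustive selection versus a fixed partition."""
    exhaustive = math.comb(K, K_tilde)  # all subsets of size K_tilde
    partition = K // K_tilde            # number of sets in the partition
    return exhaustive, partition

counts = candidate_counts(12, 3)
```

With 12 UTs and groups of 3, the exhaustive search covers 220 candidate sets whereas the partition contains only 4.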
Fig. 2. An example scenario of user scheduling with two BSs and six UTs.

In other words, $\mathcal{P}_{\mathcal{S}}$ is a partition of the set of all users $\mathcal{S}$. Furthermore, it is assumed that the partition $\mathcal{P}_{\mathcal{S}}$
is fixed by the CS and known to all BSs. Hence, the proposed scheduling scheme selects the following
set of users

$$\mathcal{U}_d = \arg_{\mathcal{U}}\ \max_{i = 1, \dots, B}\ \max_{\mathcal{U} \in \mathcal{P}_{\mathcal{S}}}\ \min_{k \in \mathcal{U}}\ |a_{ik}|^2. \qquad (31)$$

To summarize, the scheduling scheme works as follows

i) The CS fixes a partition $\mathcal{P}_{\mathcal{S}}$ and communicates it to all BSs.

ii) BS i finds $\max_{\mathcal{U} \in \mathcal{P}_{\mathcal{S}}}\min_{k \in \mathcal{U}}|a_{ik}|^2$, and sends this value and the index of the maximizing set $\mathcal{U}$ to
the CS.

iii) The CS chooses the largest value and broadcasts the index of the winning set $\mathcal{U}_d$ as defined in (31).

iv) All the BSs simultaneously serve the UTs in $\mathcal{U}_d$.

An example with two BSs and six UTs is shown in Fig. 2. In this example, in order to serve two
UTs simultaneously, a partition of three sets is fixed by the CS. With local CSI, each BS compares the
coefficients $\min_{k \in \mathcal{U}}|a_{ik}|^2$ for all three sets $\mathcal{U}$, finds the largest one, and sends the corresponding
"index(value)" pair to the CS. The CS compares the values and broadcasts the index of the winning
set (set 1 in this example).
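The four scheduling steps can be sketched as follows. The sketch assumes that each BS already holds a table of effective gains $|a_{ik}|^2$ for every set of the partition (these gains depend on the set, since the ZF beamformers do). The function names, the 0-based set indices, and the numerical values for two BSs and three sets of two UTs are invented for illustration and are not taken from Fig. 2.

```python
def bs_report(gains_per_set):
    """BS side: gains_per_set[u] lists |a_ik|^2 for the users k in set u.
    Returns the index of the best set and its worst-user gain."""
    worst = [min(g) for g in gains_per_set]
    best = max(range(len(worst)), key=worst.__getitem__)
    return best, worst[best]

def dds_select(gains):
    """CS side: gains[i] is the table held by BS i.
    Returns (winning set index, reported value, reporting BS)."""
    reports = [bs_report(table) for table in gains]
    bs = max(range(len(reports)), key=lambda i: reports[i][1])
    index, value = reports[bs]
    return index, value, bs

# Hypothetical gains: 2 BSs, partition into 3 sets of 2 UTs each.
gains = [
    [[0.9, 0.4], [0.2, 1.5], [0.7, 0.6]],  # BS 0
    [[1.1, 0.8], [0.3, 0.5], [0.1, 2.0]],  # BS 1
]
winner = dds_select(gains)
```

Here BS 0 reports set 2 with value 0.6 and BS 1 reports set 0 with value 0.8, so the CS broadcasts set 0. Only one index and one real value per BS cross the backhaul.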

B. Diversity gain analysis 

In this subsection, we analyze the diversity gain achieved by our proposed DDS and compare it with 
the upper bound. The result is summarized in the following Theorem. 

Theorem 4: Let K, B, and M denote the number of UTs, the number of BSs, and the number of antennas
per BS. By serving $\tilde{K}$ UTs simultaneously, the following diversity gain is achievable with DDS

$$d_{\mathcal{U}_d}(\tilde{K}) \ge B\frac{K}{\tilde{K}}(M - \tilde{K} + 1), \qquad (32)$$

for $K = n\tilde{K}$ with an integer n. Furthermore, the optimal diversity gain achieved by (29) is upper-bounded
by

$$d_{\mathcal{U}^*}(\tilde{K}) \le B(K - \tilde{K} + 1)(M - \tilde{K} + 1). \qquad (33)$$

Proof: Appendix C. ■

January 14, 2010 DRAFT
Remark 4.1: For a fixed K, the diversity gain of the proposed scheduling scheme grows as O(BKM), 
i.e., the optimal diversity scaling with B, K, and M. In this sense, the proposed scheme is order 
optimal in terms of diversity gain. Also note that the lower bound (l32l and the upper bound (1331) 
coincide in some specific settings. First, c?u d (l) = du*(l) = BKM, V B, K, M. That is, the proposed 
scheme is diversity optimal if only one user is served in the system. Then, for K < M, we have 
du d (K) = dw{K) = B(M — K + 1). This corresponds to the case where all users in the system are 
served simultaneously. 

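Remark 4.2: Interestingly, exactly the same diversity order is achieved for $\mathcal{U}^*$ if we set $\mathcal{Q} = \mathcal{P}_s$.
To see this, let us rewrite

$$ \min_{k \in \mathcal{U}^*} \|\mathbf{a}_k\|^2 = \max_{\mathcal{U} \in \mathcal{P}_s} \min_{k \in \mathcal{U}} \|\mathbf{a}_k\|^2 \tag{34} $$

$$ \le \max_{\mathcal{U} \in \mathcal{P}_s} \|\mathbf{a}_k\|^2, \quad \forall k \in \mathcal{U} \tag{35} $$

and note that $\max_{\mathcal{U} \in \mathcal{P}_s} \|\mathbf{a}_k\|^2$ is of diversity $B \frac{K}{\tilde{K}} (M + 1 - \tilde{K})$.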



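Remark 4.3: When $\tilde{K}$ does not divide $K$, we consider only $\lfloor K/\tilde{K} \rfloor \tilde{K}$ out of $K$ users. Since $\tilde{K}$
divides $\lfloor K/\tilde{K} \rfloor \tilde{K}$, the following diversity gain can be achieved

$$ d_{\mathcal{U}_d}(\tilde{K}) = B \left\lfloor \frac{K}{\tilde{K}} \right\rfloor (M - \tilde{K} + 1) \tag{36} $$

with the proposed DDS.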
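As a quick numerical illustration of the bounds in Theorem 4, the short sketch below evaluates the lower bound (32) and the upper bound (33) for a few values of the number of served UTs. The function names and the parameter choice $B = 2$, $M = 4$, $K = 6$ are assumptions for the example, and the integer division assumes that the factor in Remark 4.3 is read as the integer part of the ratio of total to served UTs.

```python
def dds_diversity_lower(B, M, K, K_served):
    # Lower bound (32); the integer division covers the case of
    # Remark 4.3 where K_served does not divide K (assumed reading).
    return B * (K // K_served) * (M - K_served + 1)

def diversity_upper(B, M, K, K_served):
    # Upper bound (33) on the optimal diversity gain.
    return B * (K - K_served + 1) * (M - K_served + 1)

B, M, K = 2, 4, 6
for K_served in (1, 2, 3):
    print(K_served,
          dds_diversity_lower(B, M, K, K_served),
          diversity_upper(B, M, K, K_served))
```

With a single served UT both bounds equal $BKM = 48$, in line with Remark 4.1, while for two or three served UTs the lower bound falls below the upper bound.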



V. Numerical Examples 



This section provides some numerical examples to verify the behavior of our proposed distributed ZF 
beamforming scheme in a simple network MIMO configuration with B = 2 cooperative BSs. We assume 
the same power constraint at both BSs, $P_1 = P_2$, and let $P$ denote the SNR.

Fig. 3. One dimensional configuration.

Fig. [5] shows the outage probability performance versus SNR for K = 2 and M = 2, 4. The target rate
is fixed to $(\gamma_1, \gamma_2) = (3, 1)$ bit/channel use, and we let $\sigma_{ik} = 1$ for all $i, k$. We compare the different
power allocation strategies: algorithm A1 with perfect CSIT, algorithm A2 with statistical CSIT, and
equal power allocation ($p_{i1} = p_{i2} = P/2$ for $i = 1, 2$). For the sake of comparison, we also consider the
case without network MIMO (no message sharing), where each BS sends a message to its corresponding
UT in a distributed fashion. In order to make the comparison fair in terms of complexity, we let each BS
$i$ send the symbol $s_i$ by ZF beamforming, i.e. $\mathbf{x}_i = \sqrt{P}\,\mathbf{g}_i s_i$, where $\mathbf{g}_i$ is a unit-norm vector orthogonal
to $\mathbf{h}_{ik}$ for $k \ne i$. From Lemma [1], such a system offers a diversity order of $M - K + 1$ for each UT. As
expected from Theorem [1], we observe that our BS cooperation schemes achieve a diversity
gain of $2(M - K + 1)$, i.e. 2 and 6 with M = 2 and 4, respectively. These gains are twice as large as in the case
without network MIMO. Moreover, the proposed algorithms provide a significant power gain compared
to equal power allocation.
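To make the ZF beamforming used in the baseline concrete, the following sketch computes a unit-norm vector that is orthogonal to the channels of all non-intended UTs. It is a minimal sketch under stated assumptions: the helper name `zf_beamformer`, the convention that row $k$ of `H` is the conjugate-transposed channel seen by UT $k$ from one BS, and the use of the pseudo-inverse are choices made for illustration, not taken from the paper.

```python
import numpy as np

def zf_beamformer(H, target):
    """Unit-norm ZF vector for UT `target` (hypothetical helper).

    H is assumed to be a (K, M) complex array with K <= M and full
    row rank; UT k observes H[k] @ x when the BS transmits x.
    """
    W = np.linalg.pinv(H)        # H @ W is the K x K identity
    g = W[:, target]             # column that nulls every other UT
    return g / np.linalg.norm(g)

rng = np.random.default_rng(0)
K, M = 2, 4
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)
g = zf_beamformer(H, target=0)
print(abs(H[1] @ g))        # interference at the other UT, numerically zero
print(np.linalg.norm(g))    # unit norm
```

The remaining gain `abs(H[0] @ g)` is what the intended UT sees after the interference has been nulled.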

In Fig. [6], we plot the individual outage probability, i.e. the probability that UT $k$ cannot support its target rate
$\gamma_k$, under the same setting as Fig. [5] for M = 2. With perfect CSIT, our proposed waterfilling allocation
A1 guarantees an identical outage probability for both UTs by offering strict fairness. This agrees
well with the second part of Theorem [3]. Under statistical CSIT, algorithm A2 provides a better outage
probability to UT 1 but keeps the gap between the two UTs smaller than equal power allocation does.

In order to evaluate the impact of asymmetric path loss on the outage performance, we consider a
simple 1-D configuration illustrated in Fig. [3], where UT 2 is located at x = 3 and UT 1 moves along
the x-axis up to x = 2. Assuming that BSs 1 and 2 are at x = 1 and x = 3, respectively, we vary $d_{11} = \sqrt{1 + (1 - x)^2}$ and $d_{21} =
\sqrt{1 + (3 - x)^2}$, while we fix the position of UT 2 by letting $d_{12} = d_{22} = 1$. By taking into account
the path loss, which decays as a negative power of the distance $d_{ik}$, we plot the outage probability as a function of the position $x$ of UT 1 in Fig. [7].
We consider M = 4, SNR P = 10 dB and fix the target rate $\gamma_1 = \gamma_2 = 1$ bit/channel use. We observe
that the proposed distributed ZF scheme provides a significant gain compared to the case without network
MIMO, especially as UT 1 gets close to the cell boundary (x = 2). This is because the performance
without network MIMO only depends on $d_{11}$, while the distributed ZF becomes more and more beneficial
as $d_{21}$ decreases.
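The geometry of this 1-D configuration is easy to tabulate. The sketch below evaluates the two distances seen by UT 1 from the square-root expressions in the text; the function name `ut1_distances` and the sampled positions are assumptions for illustration only.

```python
import math

def ut1_distances(x):
    # Distance from BS 1 (at x = 1) and from BS 2 (at x = 3) to UT 1.
    d_bs1 = math.sqrt(1 + (1 - x) ** 2)
    d_bs2 = math.sqrt(1 + (3 - x) ** 2)
    return d_bs1, d_bs2

for x in (1.0, 1.5, 2.0):
    print(x, ut1_distances(x))
```

At x = 1 the UT sits directly under BS 1, and at the cell boundary x = 2 the two distances are equal, which is where the cooperation of the second BS matters most.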

Finally, Fig. [8] shows the outage probability versus SNR when there are more users in the system than
served users, i.e. $K > \tilde{K} = 2$. Considering the same setting as Fig. [5] for M = 4, we apply distributed
diversity scheduling to select a set of two users among K = 2, 4, 6. Once the user selection is done, any
power allocation studied in Section [III] can be applied. However, it is non-trivial (if not impossible) to
characterize the statistics of the overall channel gains in the presence of user scheduling. Hence,
we illustrate here only the performance with equal power allocation under statistical CSIT. As a matter
of fact, any smarter allocation is expected to perform between the waterfilling allocation and the equal power
allocation. As expected from Theorem [4], the diversity gain increases significantly as the number K of
users in the system gets large.

VI. Conclusions 

We considered the multi-cell downlink system (network MIMO) where B BSs, perfectly connected
via reliable backbone links to the CS, wish to communicate simultaneously with K UTs. As one of
the realistic limitations of network MIMO, we explicitly accounted for partial CSIT, i.e. local channel
knowledge at each BS and statistical channel knowledge at the CS. Under this setting, we proposed an
outage-efficient strategy which builds on distributed ZF beamforming performed at each BS and
efficient power allocation algorithms at the CS. For the case of a small number of users K < M, the
proposed scheme enables each UT to achieve a diversity gain of $B(M - K + 1)$. For the case of many
users K > M, we proposed distributed diversity scheduling (DDS), which can be implemented in a
distributed fashion at each BS and requires only a limited amount of backbone communication. We
also proved that DDS can offer a diversity gain of $B \frac{K}{\tilde{K}} (M - \tilde{K} + 1)$ and that this gain scales optimally
with the number of cooperative BSs as well as the number of UTs. The main finding is that limited BS
cooperation can still make network MIMO attractive, in the sense that a well-designed scheme can offer
high data rates with sufficient reliability to individual UTs. The proposed scheme can be suitably applied
to any other interference network where the transmitters can perfectly share the messages to all UTs
and a master transmitter can handle the resource allocation.
Fig. 4. Rate region $\mathcal{C}(\mathbf{a}; P)$ and subregion $\mathcal{R}_s$.



Appendix 

A. Proof of Theorem [1]

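The channel (0]) is a MISO channel defined by $\mathbf{a}_k = (a_{1k}, \dots, a_{Bk})$, and

$$ \Pr\{\|\mathbf{a}_k\|^2 < \epsilon\} \doteq \Pr\{\max_i |a_{ik}|^2 < \epsilon\} \tag{37} $$

$$ = \Pr\{\cap_i \{|a_{ik}|^2 < \epsilon\}\} $$

$$ = \prod_i \Pr\{|a_{ik}|^2 < \epsilon\} \tag{38} $$

$$ = (\Pr\{|a_{ik}|^2 < \epsilon\})^B \tag{39} $$

Here $\doteq$ denotes equality of the exponents as $\epsilon \to 0$, which holds in (37) because $\max_i |a_{ik}|^2 \le \|\mathbf{a}_k\|^2 \le B \max_i |a_{ik}|^2$. From Lemma [2], the maximum diversity is $B(M - K + 1)$.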
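The exponent in this argument can be checked numerically. The sketch below is an illustration under an explicit assumption that is not stated in this appendix: each gain $|a_{ik}|^2$ is modeled as a Gamma random variable with integer shape $m = M - K + 1$ and unit scale, and the gains are independent across BSs. The helper names `gamma_cdf` and `diversity_slope` are hypothetical.

```python
import math

def gamma_cdf(eps, m, terms=30):
    # P(X < eps) for X ~ Gamma(shape m, scale 1) with integer m,
    # written as the tail series exp(-eps) * sum_{j >= m} eps^j / j!.
    return math.exp(-eps) * sum(eps ** j / math.factorial(j)
                                for j in range(m, m + terms))

def diversity_slope(B, m, eps_hi=1e-2, eps_lo=1e-3):
    # Log-log slope of Pr{max_i |a_ik|^2 < eps} = (Pr{|a_ik|^2 < eps})^B.
    p_hi = gamma_cdf(eps_hi, m) ** B
    p_lo = gamma_cdf(eps_lo, m) ** B
    return (math.log(p_hi) - math.log(p_lo)) / (math.log(eps_hi) - math.log(eps_lo))

B, M, K = 2, 4, 2
print(diversity_slope(B, M - K + 1))  # close to B * (M - K + 1) = 6
```

Under this model the per-BS probability behaves like $\epsilon^{m}$ near zero, so raising it to the power $B$ multiplies the exponent by $B$.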

B. Proof of Theorem [3]

First we remark that, for a given channel realization $\mathbf{a}$, one of the following two cases occurs:

(a) The target rate tuple is outside the region, $\boldsymbol{\gamma} \notin \mathcal{C}(\mathbf{a}, P)$.

(b) The target rate tuple is inside the region, $\boldsymbol{\gamma} \in \mathcal{C}(\mathbf{a}, P)$.

The above two cases are illustrated in Fig. [4] (a), (b), respectively, for K = 2. In case (a), we are in
an outage event regardless of the power allocation. For case (b), we define the subregion $\mathcal{R}_s(\mathbf{a})$

$$ \mathcal{R}_s(\mathbf{a}) = \{\mathbf{R} \mid \mathbf{R} \in \mathcal{C}(\mathbf{a}, P),\ R_k \ge \gamma_k,\ k = 1, \dots, K\} \tag{41} $$

depicted as the shaded area in Fig. [4] (b). We define $\mathcal{P}_s$ as the class of power allocation policies that
map $\mathbf{a}$ into a rate tuple $\mathbf{R}$ inside $\mathcal{R}_s(\mathbf{a})$ whenever we are in case (b). We remark that any policy
belonging to $\mathcal{P}_s$ results in a successful transmission, and thus minimizes the outage probability. Since




the proposed rate balancing scheme allocates the power $(p_1^*, \dots, p_K^*)$ so that the rate tuple $(R_1^*, \dots, R_K^*)$
is proportional to $\boldsymbol{\gamma}$ on the boundary of $\mathcal{C}(\mathbf{a}, P)$ whenever $\boldsymbol{\gamma} \in \mathcal{C}(\mathbf{a}, P)$, it belongs to the class $\mathcal{P}_s$. This
establishes the first part.

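We now prove the second part. It is immediate to see that with the rate balancing scheme, we have
for any $\mathbf{a}$

$$ 1(\gamma_k \le R_k^*(\mathbf{a})) = 1(\alpha_k \gamma_1 \le \alpha_k R_1^*(\mathbf{a})) = 1(\gamma_1 \le R_1^*(\mathbf{a})), \quad k = 2, \dots, K \tag{42} $$

where the first equality follows from (TTQb . The outage probability with the rate balancing scheme can
always be written as

$$ P^{\mathrm{balance}}(\boldsymbol{\gamma}) = 1 - \Pr(\cap_{k=1}^{K} \{\gamma_k \le R_k^*(\mathbf{a})\}) $$

$$ = 1 - \Pr(\gamma_k \le R_k^*(\mathbf{a})) \Pr(\cap_{k' \ne k} \{\gamma_{k'} \le R_{k'}^*(\mathbf{a})\} \mid \gamma_k \le R_k^*(\mathbf{a})) $$

$$ = 1 - \Pr(\gamma_k \le R_k^*(\mathbf{a})) $$

where the last equality follows since the equalities (42) imply $\Pr(\cap_{k' \ne k} \{\gamma_{k'} \le R_{k'}^*(\mathbf{a})\} \mid \gamma_k \le R_k^*(\mathbf{a})) = 1$
for any $k$. This completes the second part.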

C. Proof of Theorem [4]

In order to examine the diversity gain of the proposed scheduling scheme, we first remark

$$ \min_{k \in \mathcal{U}_d} \|\mathbf{a}_k\|^2 \ge \min_{k \in \mathcal{U}_d} \max_i |a_{ik}|^2 \tag{43} $$

$$ \ge \max_i \min_{k \in \mathcal{U}_d} |a_{ik}|^2 \tag{44} $$

$$ = \max_{\mathcal{U} \in \mathcal{P}_s} \max_i \min_{k \in \mathcal{U}} |a_{ik}|^2 \tag{45} $$

where (44) follows from the max-min inequality and the last equality holds since we can rewrite (31)
as

$$ \mathcal{U}_d = \arg\max_{\mathcal{U} \in \mathcal{P}_s} \max_{i = 1, \dots, B} \min_{k \in \mathcal{U}} |a_{ik}|^2 \tag{46} $$

by swapping the maximization over $\mathcal{U}$ and that over $i$. To find the diversity order of the scheme, we need
to look at the following near-zero behavior of the channel coefficients

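$$ \Pr\{\min_{k \in \mathcal{U}_d} \|\mathbf{a}_k\|^2 < \epsilon\} \le \Pr\{\max_{\mathcal{U} \in \mathcal{P}_s} \max_i \min_{k \in \mathcal{U}} |a_{ik}|^2 < \epsilon\} \tag{47} $$

$$ = (\Pr\{\min_{k \in \mathcal{U}} |a_{ik}|^2 < \epsilon\})^{B|\mathcal{P}_s|} \tag{48} $$

$$ \le \Big(\sum_{k \in \mathcal{U}} \Pr\{|a_{ik}|^2 < \epsilon\}\Big)^{B|\mathcal{P}_s|} \tag{49} $$

$$ = (|\mathcal{U}| \Pr\{|a_{ik}|^2 < \epsilon\})^{B|\mathcal{P}_s|} \tag{50} $$

$$ \doteq (|\mathcal{U}| \epsilon^{M + 1 - \tilde{K}})^{B|\mathcal{P}_s|} \tag{51} $$

$$ \doteq \epsilon^{B \frac{K}{\tilde{K}} (M + 1 - \tilde{K})}, \tag{52} $$

where (48) follows from the fact that the sets $\mathcal{U}$ are disjoint in $\mathcal{P}_s$ and that $\min_{k \in \mathcal{U}} |a_{ik}|^2$ are independent for
different $\mathcal{U}$ and $i$; (49) is from the union bound. From Lemma [2], (32) is straightforward. For the upper
bound of the diversity gain of the scheduling scheme (29), let us first write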



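$$ \max_{\mathcal{U} \in \mathcal{Q}} \min_{k \in \mathcal{U}} \|\mathbf{a}_k\|^2 \le \max_{\mathcal{U} \in \mathcal{Q}} \|\mathbf{a}_1\|^2 \tag{53} $$

$$ \le B \max_i \max_{\mathcal{U} \in \mathcal{Q}} |a_{i1}|^2 \tag{54} $$

where the first inequality is from the fact that the worst user cannot be better than the first user; the
second inequality is from $\|\mathbf{a}_1\|^2 \le B \max_i |a_{i1}|^2$. From [26, Theorem 1], we know that the diversity
gain of $\max_{\mathcal{U} \in \mathcal{Q}} |a_{i1}|^2$ is $(K - \tilde{K} + 1)(M - \tilde{K} + 1)$. Therefore, it readily follows that the diversity gain of
$\max_{\mathcal{U} \in \mathcal{Q}} \min_{k \in \mathcal{U}} \|\mathbf{a}_k\|^2$ is upper-bounded by $B(K - \tilde{K} + 1)(M - \tilde{K} + 1)$.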
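The first step of the lower-bound argument, namely that the worst squared norm in the DDS-selected set is never smaller than the value the winning BS reported, can be checked on random draws. The sketch below is illustrative only: the helper name `lower_bound_chain`, the exponentially distributed gains and the partition are assumptions, with `gains[i, k]` standing for $|a_{ik}|^2$.

```python
import numpy as np

def lower_bound_chain(gains, partition):
    B = gains.shape[0]

    def score(U):
        # max over BSs of the worst-user gain within the set U
        return max(gains[i, U].min() for i in range(B))

    U_d = max(partition, key=score)                 # DDS-selected set
    reported = score(U_d)                           # max_U max_i min_k |a_ik|^2
    worst_norm = gains[:, U_d].sum(axis=0).min()    # min_k ||a_k||^2 over U_d
    return worst_norm, reported

rng = np.random.default_rng(0)
partition = [[0, 1], [2, 3], [4, 5]]
holds = True
for _ in range(1000):
    gains = rng.exponential(size=(2, 6))   # assumed gain model, B = 2, K = 6
    worst_norm, reported = lower_bound_chain(gains, partition)
    holds = holds and bool(worst_norm >= reported)
print(holds)
```

The inequality holds for every draw because each squared norm is a sum of nonnegative per-BS gains, so it dominates the largest of them.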
Fig. 5. Outage probability vs. SNR with B = K = 2 and M = 2, 4 (horizontal axis: SNR per BS [dB]).

Fig. 6. Individual outage probabilities vs. SNR with B = K = 2 and M = 2.

Fig. 7. Outage probability vs. location of UT1 with B = K = 2 and M = 2, 4 (horizontal axis: position x of UT1).